Distributed Resource Management for Parallel Applications in Networks of Workstations
Abstract
Running parallel applications in a network of workstations (NOW) requires a resource management system with batch queueing and load balancing functionality, both to utilize idle workstations and to avoid load imbalance in the network. A resource management system for parallel jobs also requires special functionality to schedule jobs to hosts and to support checkpointing and migration of parallel applications. This paper describes the essential components of a distributed resource management system supporting parallel computations in a NOW and shows how existing resource management components can be reused for this approach. The implementation of a distributed resource manager demonstrates the practical relevance of the design concept.
Similar resources
Efficient Resource Management for Scientific Applications in Distributed Computing Environment
Resource management for scientific applications in a vast distributed computing environment, such as the worldwide interconnected networks (Internet, Web), is a complex problem. This paper investigates various techniques used to manage resources in a network of workstations at both coarse and fine levels of granularity. A new strategy which combines two competitive techniques, one from each granu...
Scheduling Parallel Applications in Networks of Mixed Uniprocessor/Multiprocessor Workstations
Trying to exploit the idle computing power of workstation networks for parallel applications requires means for dynamic workload scheduling. In this paper, we present the features of the Winner resource management system developed for this purpose. Winner relies on an elaborate technique for accurately measuring the currently available computing speed of a workstation, particularly in the prese...
Dealing with Heterogeneity in Stardust: An Environment for Parallel Programming on Networks of Heterogeneous Workstations
This paper describes the management of heterogeneity in Stardust, an environment for parallel programming on networks of heterogeneous machines, which can include distributed memory multi-computers and networks of workstations. Applications using Stardust can communicate both through message passing and distributed shared memory. Stardust is currently implemented on a heterogeneous system i...
Software Engineering Methods for Parallel and Distributed Scientific Computing
In this paper, we present an interdisciplinary research project whose central objective is to develop new software engineering (SWE) methods for distributed memory parallel scientific computing. Our emphasis is on putting into practice and evaluating the proposed methods. The main test case for their definition and evaluation is the parallelization of an industrial CFD software package. A major...